The number of international benchmarking competitions is steadily increasing in various fields of machine learning (ML) research and practice. So far, however, little is known about the common practice as well as bottlenecks faced by the community in tackling the research questions posed. To shed light on the status quo of algorithm development in the specific field of biomedical imaging analysis, we designed an international survey that was issued to all participants of challenges conducted in conjunction with the IEEE ISBI 2021 and MICCAI 2021 conferences (80 competitions in total). The survey covered participants' expertise and working environments, their chosen strategies, as well as algorithm characteristics. A median of 72% of the challenge participants took part in the survey. According to our results, knowledge exchange was the primary incentive (70%) for participation, while the reception of prize money played only a minor role (16%). While a median of 80 working hours was spent on method development, a large portion of participants (32%) stated that they did not have enough time for method development, and 25% perceived the infrastructure to be a bottleneck. Overall, 94% of all solutions were deep learning-based. Of these, 84% were based on standard architectures. 43% of the respondents reported that the data samples (e.g., images) were too large to be processed at once. This was most commonly addressed by patch-based training (69%), downsampling (37%), and solving 3D analysis tasks as a series of 2D tasks. K-fold cross-validation on the training set was performed by only 37% of the participants, and only 50% of the participants performed ensembling, based either on multiple identical models (61%) or on heterogeneous models (39%). 48% of the respondents applied postprocessing steps.
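The patch-based strategy mentioned in the abstract above can be illustrated with a minimal sketch of how patch positions might be laid out over a sample that is too large to process at once. The function names `patch_origins` and `patch_grid`, and the choice to add a final patch flush with the border, are assumptions made for illustration; the survey does not describe any particular implementation.

```python
from itertools import product


def patch_origins(size, patch, stride):
    """Start indices along one axis so that patches of length `patch`
    cover the whole range [0, size). Hypothetical helper for illustration."""
    if stride <= 0:
        raise ValueError("stride must be positive")
    if patch >= size:
        return [0]  # a single patch already covers the axis
    origins = list(range(0, size - patch + 1, stride))
    if origins[-1] != size - patch:
        origins.append(size - patch)  # last patch sits flush with the border
    return origins


def patch_grid(shape, patch_shape, stride):
    """Corner coordinates of all patches for an N-dimensional sample."""
    axes = [patch_origins(s, p, st) for s, p, st in zip(shape, patch_shape, stride)]
    return list(product(*axes))


# Example: a 10x10 image tiled with 4x4 patches and stride 4.
corners = patch_grid((10, 10), (4, 4), (4, 4))
print(len(corners), corners[:3])
```

With a stride smaller than the patch size the patches overlap, which is one common way to reduce border effects when predictions are stitched back together.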
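Convolutional neural networks have enabled significant improvements in medical image-based diagnosis. It is, however, increasingly clear that these models are susceptible to performance degradation when facing spurious correlations and dataset shift, leading, for example, to underperformance on underrepresented patient groups. In this paper, we compare two classification schemes on the ADNI MRI dataset: a simple logistic regression model that uses manually selected volumetric features, and a convolutional neural network trained on 3D MRI data. We assess the robustness of the trained models in the face of varying dataset splits, training set sex composition, and stage of disease. In contrast to earlier work in other imaging modalities, we do not observe a clear pattern of improved model performance for the majority group in the training dataset. Instead, while logistic regression is fully robust to dataset composition, we find that CNN performance is generally improved for both male and female subjects when more female subjects are included in the training dataset. We hypothesize that this might be due to inherent differences in the pathology of the two sexes. Moreover, in our analysis, the logistic regression model outperforms the 3D CNN, emphasizing the utility of manual feature specification based on prior knowledge and the need for more robust automatic feature selection.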
The application of Natural Language Processing (NLP) to specialized domains, such as the law, has recently received a surge of interest. As many legal services rely on processing and analyzing large collections of documents, automating such tasks with NLP tools emerges as a key challenge. Many popular language models, such as BERT or RoBERTa, are general-purpose models, which are limited in their ability to process specialized legal terminology and syntax. In addition, legal documents may contain specialized vocabulary from other domains, such as medical terminology in personal injury text. Here, we propose LegalRelectra, a legal-domain language model that is trained on mixed-domain legal and medical corpora. We show that our model improves over general-domain and single-domain medical and legal language models when processing mixed-domain (personal injury) text. Our training architecture implements the Electra framework, but utilizes Reformer instead of BERT for its generator and discriminator. We show that this improves the model's performance on long passages and results in better long-range text comprehension.
Attention-based multiple instance learning (AMIL) algorithms have proven to be successful in utilizing gigapixel whole-slide images (WSIs) for a variety of computational pathology tasks such as outcome prediction and cancer subtyping. We extended an AMIL approach to the task of survival prediction by utilizing the classical Cox partial likelihood as a loss function, converting the AMIL model into a nonlinear proportional hazards model. We applied the model to tissue microarray (TMA) slides of 330 lung cancer patients. The results show that AMIL approaches can handle very small amounts of tissue from a TMA and reach C-index performance similar to that of established survival prediction methods trained with highly discriminative clinical factors such as age, cancer grade, and cancer stage.
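To make the idea of using the Cox partial likelihood as a loss more concrete, the following is a minimal pure-Python sketch of the negative log partial likelihood. It assumes that the model outputs one scalar log-risk per patient and that tied event times are handled in the Breslow style; the function name and the averaging over observed events are illustrative choices, not details taken from the abstract.

```python
import math


def cox_neg_log_partial_likelihood(risk_scores, times, events):
    """Negative log Cox partial likelihood, averaged over observed events.

    risk_scores: predicted log-risk per patient (assumed model output)
    times:       observed follow-up time per patient
    events:      1 if the event was observed, 0 if the patient was censored
    """
    n_events = sum(events)
    if n_events == 0:
        return 0.0  # no observed events, nothing to compare
    total = 0.0
    for score_i, time_i, event_i in zip(risk_scores, times, events):
        if not event_i:
            continue  # censored patients only appear in risk sets
        # Risk set: everyone still under observation at time_i.
        at_risk = [s for s, t in zip(risk_scores, times) if t >= time_i]
        # Numerically stable log-sum-exp over the risk set.
        m = max(at_risk)
        log_sum = m + math.log(sum(math.exp(s - m) for s in at_risk))
        total += score_i - log_sum
    return -total / n_events


# Scores that rank early events as high risk give a lower loss
# than scores that rank them as low risk.
good = cox_neg_log_partial_likelihood([2.0, 1.0, 0.0], [1, 2, 3], [1, 1, 1])
bad = cox_neg_log_partial_likelihood([0.0, 1.0, 2.0], [1, 2, 3], [1, 1, 1])
print(good < bad)
```

In a deep learning setting the same expression would be written with differentiable tensor operations so that gradients can flow back into the network producing the risk scores.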
Nucleolar organizer regions (NORs) are parts of the DNA that are involved in RNA transcription. Due to the silver affinity of associated proteins, argyrophilic NORs (AgNORs) can be visualized using silver-based staining. The average number of AgNORs per nucleus has been shown to be a prognostic factor for predicting the outcome of many tumors. Since manual detection of AgNORs is laborious, automation is of high interest. We present a deep learning-based pipeline for automatically determining the AgNOR-score from histopathological sections. An additional annotation experiment was conducted with six pathologists to provide an independent performance evaluation of our approach. Across all raters and images, we found a mean squared error of 0.054 between the AgNOR-scores of the experts and those of the model, indicating that our approach offers performance comparable to humans.
Mitotic activity is key for the assessment of malignancy in many tumors. Moreover, it has been demonstrated that the proportion of abnormal mitosis to normal mitosis is of prognostic significance. Atypical mitotic figures (MF) can be identified morphologically as having segregation abnormalities of the chromatids. In this work, we perform, for the first time, automatic subtyping of mitotic figures into normal and atypical categories according to characteristic morphological appearances of the different phases of mitosis. Using the publicly available MIDOG21 and TUPAC16 breast cancer mitosis datasets, two experts blindly subtyped mitotic figures into five morphological categories. Further, we set up a state-of-the-art object detection pipeline extending the anchor-free FCOS approach with a gated hierarchical subclassification branch. Our labeling experiment indicated that subtyping of mitotic figures is a challenging task and prone to inter-rater disagreement, which we found in 24.89% of MF. Using the more diverse MIDOG21 dataset for training and TUPAC16 for testing, we reached a mean overall average precision score of 0.552, a ROC AUC score of 0.833 for atypical/normal MF and a mean class-averaged ROC-AUC score of 0.977 for discriminating the different phases of cells undergoing mitosis.
Much recent work in task-oriented parsing has focused on finding a middle ground between flat slots and intents, which are inexpressive but easy to annotate, and powerful representations such as the lambda calculus, which are expressive but costly to annotate. This paper continues the exploration of task-oriented parsing by introducing a new dataset for parsing pizza and drink orders, whose semantics cannot be captured by flat slots and intents. We perform an extensive evaluation of deep-learning techniques for task-oriented parsing on this dataset, including different flavors of seq2seq systems and RNNGs. The dataset comes in two main versions, one in a recently introduced utterance-level hierarchical notation that we call TOP, and one whose targets are executable representations (EXR). We demonstrate empirically that training the parser to directly generate EXR notation not only solves the problem of entity resolution in one fell swoop and overcomes a number of expressive limitations of TOP notation, but also results in significantly greater parsing accuracy.
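From both an educational and a research point of view, experiments on hardware are a key aspect of robotics and control. In the last decade, many open-source hardware and software frameworks for wheeled robots have been presented, mainly in the form of unicycles and car-like robots, with the goal of making robotics accessible to a wider audience and supporting control systems development. Unicycles are usually small and inexpensive, and therefore facilitate experiments with larger fleets, but they are not suited for high-speed motion. Car-like robots are more agile, but they are usually larger and more expensive, thus requiring more resources in terms of space and money. To bridge this gap, we present Chronos, a new car-like 1/28th-scale robot with customized open-source electronics, and CRS, an open-source software framework for control and robotics. The CRS software framework includes implementations of various state-of-the-art algorithms for control, estimation, and multi-agent coordination. With this work, we aim to provide easier access to hardware and to reduce the engineering time needed to start new educational and research projects.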
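Foundation models are considered a breakthrough in all applications of AI, promising reusable mechanisms for feature extraction that alleviate the need for large amounts of high-quality training data for task-specific prediction models. However, foundation models may encode and even reinforce existing biases present in historical datasets. Given the limited ability to scrutinize foundation models, it remains unclear whether the opportunities outweigh the risks in safety-critical applications such as clinical decision making. In our statistical bias analysis of a recently published and publicly available chest X-ray foundation model, we found reasons for concern, as the model seems to encode protected characteristics, including biological sex and racial identity, which may lead to disparate performance across subgroups in downstream applications. While research into foundation models for healthcare applications is at an early stage, we believe it is important to make the community aware of these risks in order to avoid harm.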
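Denoising diffusion probabilistic models (DDPMs) are a recent family of generative models that achieve state-of-the-art results. To obtain class-conditional generation, it was suggested to guide the diffusion process by gradients from a time-dependent classifier. While the idea is theoretically sound, deep learning-based classifiers are notoriously susceptible to gradient-based adversarial attacks. Therefore, while traditional classifiers may achieve good accuracy scores, their gradients are possibly unreliable and might hinder the improvement of the generation results. Recent work discovered that adversarially robust classifiers exhibit gradients that are aligned with human perception, and these could better guide a generative process towards semantically meaningful images. We utilize this observation by defining and training a time-dependent adversarially robust classifier and using it as guidance for a generative diffusion model. In experiments on the highly challenging and diverse ImageNet dataset, our scheme introduces significantly more intelligible intermediate gradients, better alignment with theoretical findings, as well as improved generation results under several evaluation metrics. Furthermore, we conduct an opinion survey whose findings indicate that human raters prefer our method's results.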
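The classifier guidance referred to in the abstract above is usually motivated by Bayes' rule applied to the noisy sample. The following sketch uses notation that is an assumption for illustration and does not come from the abstract: $x_t$ is the noisy sample at step $t$, $y$ the class label, $p_\phi$ the time-dependent classifier, $\mu_\theta$ and $\Sigma_\theta$ the mean and covariance of the reverse diffusion step, and $s$ a guidance scale.

```latex
% Bayes' rule: the term \log p(y) does not depend on x_t and drops out.
\nabla_{x_t} \log p(x_t \mid y)
  = \nabla_{x_t} \log p(x_t) + \nabla_{x_t} \log p(y \mid x_t)

% A commonly used guided sampling step shifts the reverse-process mean
% along the classifier gradient, scaled by s:
\hat{\mu}(x_t, t)
  = \mu_\theta(x_t, t)
  + s \, \Sigma_\theta(x_t, t) \, \nabla_{x_t} \log p_\phi(y \mid x_t, t)
```

In this form the quality of the guidance depends directly on the classifier gradient $\nabla_{x_t} \log p_\phi(y \mid x_t, t)$, which is the quantity the abstract argues becomes more reliable when the classifier is adversarially robust.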